REMARKS 



Claims 1, 4-5, 7-11, 14-15, 17-23, and 25-28 are pending in the present application. 
The Examiner has rejected claims 1, 4-5, 7-11, 14-15, 17-23, and 25-28 under 35 U.S.C. 
§ 103(a) as being obvious over U.S. Patent No. 6,081,780 (Lumelsky) in view of Applicant's 
Admitted Prior Art (AAPA). 

The Examiner alleges that Lumelsky discloses a text-to-speech and prosody based 
authoring system that includes a speech analyzer that generates a speech signal representative 
of one or more prosodic parameters associated with a speaker and a text-to-speech converter 
responsive to a text signal that generates a phonetic representation signal and synthesizes a 
speech signal from the text signal. The Examiner then alleges that it would be obvious to 
implement alignment processing in Lumelsky's system for improved marked text and 
generating more natural speech. 

Applicant respectfully disagrees. 

Lumelsky is directed to providing content providers with authoring tools for producing 
highly compressed voice content. A singlecast interactive radio system is disclosed that 
provides a user with the ability to interactively receive a vast amount of information, as well 
as send information, across a single radio channel uniquely established between information 
content providers and system users. This system provides a human-authored text-to-speech 
(TTS) system that performs the following steps. First, an incoming text signal represented by 
ASCII codes is fed to a prosody analysis section. The text is normalized to expand common 
abbreviations, number sequences, acronyms, etc. The text is then syntactically parsed in 
order to identify parts of speech for each word. A morphological analysis of words is 
performed, using a morphs (word roots) dictionary in conjunction with any conventional 
affix-stripping algorithm. A prosodic synthesis step uses prosody rules to quantify 
relationships among sentences and ideas in a paragraph. A sequence of phonetic and 
prosodic symbols is generated, representing phonemes, prosodic and syllabic information. 
This phonetic transcription is considered ready for transmission to a client. A 
personalization process starts with generation of speech by a speech synthesis process using a 
speech synthesizer, allophone context tables converter and one or more dictionaries. A 
spectrum comparison process analyzes the spectrum and timing parameters of both the synthetic 
speech and the original narrator-provided speech. After several iterations by 
corrective feedback between the spectrum comparison process and the prosody synthesis 
process, whereby the original speech and synthesized speech errors are minimized, a 
composite encoded sequence (CES) editor receives the prosodically adjusted phonetic 
transcription data. The CES editing process employs a graphic or text editor which 
assembles all the analysis data in a convenient form for representing it to an operator on a 
terminal screen. As a result of this speech authoring process, the speech signal output at the 
user terminal sounds like the speech produced by the human being on whom the allophone 
dictionary was based. 

Thus, Lumelsky's input is an ASCII text string, not a text string . . . and 
corresponding spoken audio signal, as essentially recited in claims 1, 11, and 21. Moreover, 
Lumelsky's system does not perform extracting acoustic feature data from said audio signal, 
but rather from the allophone context tables converter and one or more dictionaries. 
Furthermore, there is no teaching or suggestion in Lumelsky of outputting a set of duration 
contours indicative of the duration of each word and phoneme; extracting pitch contour 
parameters from said audio spoken input; and generating a marked-up text corresponding to 
the spoken utterance using the pitch contour and duration parameters, as Lumelsky 
synthesizes speech using the aforementioned allophone context tables converter and one or 
more dictionaries. Although the Examiner has cited AAPA in combination with Lumelsky in 
rejecting independent claims 1, 11, and 21, the Examiner's only reference to AAPA is with 
respect to the Viterbi alignment process, which is recited in dependent claims 5 and 15, and 
implementing SSML for Internet use, which is recited in claims 9 and 19. If the Examiner 
insists on citing AAPA against independent claims 1, 11, and 21, she is invited to provide 
specific examples relating to limitations in those independent claims. 

Thus, Applicant urges that the combination of Lumelsky and AAPA does not teach or 
suggest all limitations of independent claims 1, 11, and 21, and therefore that a prima facie 
case of obviousness of those claims over Lumelsky and AAPA cannot be maintained. 
Reconsideration and withdrawal of these rejections are respectfully requested. 

Claims 4-5, 7-10, 14-15, 17-20, 22-23, and 25-28 each depend from one of claims 1, 11, 
and 21, and are thus patentable for at least the same reasons as claims 1, 11, and 
21. Reconsideration and withdrawal of these rejections are respectfully requested. 

CONCLUSION 



Applicant urges that claims 1, 4-5, 7-11, 14-15, 17-23, and 25-28 are in condition for 
allowance for at least the reasons stated. Early and favorable action on this case is 
respectfully requested. 